Information Structure in Mappings: An Approach to Learning, Representation, and Generalisation

Conklin, Henry

arXiv.org Artificial Intelligence

Despite the remarkable success of large-scale neural networks, we still lack unified notation for thinking about and describing their representational spaces. We lack methods to reliably describe how their representations are structured, how that structure emerges over training, and what kinds of structures are desirable. This thesis introduces quantitative methods for identifying systematic structure in a mapping between spaces, and leverages them to understand how deep-learning models learn to represent information, what representational structures drive generalisation, and how design decisions condition the structures that emerge. To do this I identify structural primitives present in a mapping, along with information-theoretic quantifications of each. These allow us to analyse learning, structure, and generalisation across multi-agent reinforcement learning models, sequence-to-sequence models trained on a single task, and Large Language Models. I also introduce a novel, performant approach to estimating the entropy of a vector space, which allows this analysis to be applied to models ranging in size from 1 million to 12 billion parameters. The experiments here work to shed light on how large-scale distributed models of cognition learn, while allowing us to draw parallels between those systems and their human analogs. They show how the structures of language, and the constraints that give rise to them, in many ways parallel the kinds of structures that drive the performance of contemporary neural networks.
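The abstract does not specify the thesis's estimator, but a standard baseline for estimating the entropy of a set of vectors is the Kozachenko-Leonenko nearest-neighbour estimator. The sketch below (pure NumPy, k=1, O(N²) distances, so suitable only for small samples) is an illustration of that classic technique, not the author's method:

```python
import math
import numpy as np

def kl_entropy(x):
    """Kozachenko-Leonenko nearest-neighbour estimate of differential
    entropy (in nats) for samples x of shape (N, d)."""
    n, d = x.shape
    # pairwise Euclidean distances; ignore self-distances on the diagonal
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    np.fill_diagonal(dist, np.inf)
    eps = dist.min(axis=1)                # distance to nearest neighbour
    # log volume of the d-dimensional unit ball
    log_vd = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    gamma_e = 0.5772156649                # Euler-Mascheroni constant
    return d * np.log(eps).mean() + log_vd + gamma_e + math.log(n - 1)

# sanity check against a known value: a standard 2-D Gaussian has
# differential entropy log(2*pi*e) ~= 2.838 nats
rng = np.random.default_rng(0)
samples = rng.standard_normal((2000, 2))
estimate = kl_entropy(samples)
```

The quadratic distance matrix is the bottleneck here; a production version would use a k-d tree or approximate nearest-neighbour search, which is presumably where a "performant" estimator for billion-parameter models must differ.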


Ushering in the third wave of AI

#artificialintelligence

Today, artificial intelligence (AI) helps you shop, provides suggestions on what music to listen to and what shows to watch, connects you with friends on social media, and even drives your car. As more companies focus their efforts on AI-based solutions, 2020 is shaping up to be a turning point as we begin to witness the third wave of AI -- when AI systems not only learn and reason as they encounter new tasks and situations, but have the ability to explain their decision making. The first wave of AI focused on enabling reasoning over narrowly defined problems, but lacked any learning capability and handled uncertainty poorly. Financial products like TurboTax and QuickBooks, for example, are able to take information from a situation where rules have previously been defined and work through it to achieve a desired outcome. However, they are unable to operate beyond the previously defined rules.


New Evidence for the Geometry of Thought - Facts So Romantic

Nautilus

In 2014, the Swedish philosopher and cognitive scientist Peter Gärdenfors went to Krakow, Poland, for a conference on the mind. He was to lecture at Jagiellonian University, courtesy of the Copernicus Center for Interdisciplinary Studies, on his theory of conceptual, or "cognitive," spaces. Gärdenfors had been working on his idea of cognitive spaces, which explain how our brains represent concepts and objects, for decades. In his book Conceptual Spaces, from 2000, he wrote, "It has long been a common prejudice in cognitive science that the brain is either a Turing machine working with symbols or a connectionist system using neural networks." In Krakow, Gärdenfors pushed against that prejudice. In his talk, "The Geometry of Thinking," he suggested that humans are able to do things that today's powerful computers can't do--like learn language quickly and generalize from particulars with ease (to see, in other words, without much training, that lions and tigers are four-legged felines)--because we, unlike our computers, represent information in geometrical space.


On the Dimensionality of Embeddings for Sparse Features and Data

Naumov, Maxim

arXiv.org Machine Learning

In this note we discuss a common misconception, namely that embeddings are always used to reduce the dimensionality of the item space. We show that when we measure dimensionality in terms of information entropy, the embedding of sparse probability distributions, which can be used to represent sparse features or data, may or may not reduce the dimensionality of the item space. However, the embeddings do provide a different and often more meaningful representation of the items for the particular task at hand. We also give upper bounds and more precise guidelines for choosing the embedding dimension.
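The abstract's central point, that entropy rather than raw item count is the right measure of dimensionality, can be illustrated with a toy example. The distribution below is a hypothetical one chosen for clarity, not taken from the note: a catalogue of 1024 items would naively need log2(1024) = 10 bits per item, but if probability mass is concentrated on only 8 of them, the Shannon entropy is just 3 bits, so far fewer effective dimensions carry information:

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # 0 * log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

# sparse distribution over 1024 items with uniform mass on just 8 of them
p = np.zeros(1024)
p[:8] = 1.0 / 8.0

h = entropy_bits(p)                   # 3.0 bits
naive = np.log2(p.size)               # 10.0 bits for a uniform distribution
```

In this reading, an embedding sized to the entropy (here 3 bits of information) compresses the item space, while one sized to log of the item count does not; which regime applies depends on how sparse the data actually is.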


The Importance of Brain Theory in True Machine Intelligence - insideBIGDATA

#artificialintelligence

Brains have been associated with the field of artificial intelligence for more than half a century. The reason is simple: the brain is the best and perhaps only example we have of an intelligent system. But should the brain serve as mere inspiration or can it be a roadmap, providing the most efficient path to machine intelligence? More than 50 years ago artificial neural networks, or ANNs for short, were created with the intent of designing something that, like the brain, could learn without expert rules or human supervision. However, ANNs were designed at a time when little was known about how neurons worked in the brain.